Two sets of random spatial samples were drawn from the city of La Plata.
The sample points were generated in QGIS1 with Vector -> Research Tools -> Random points inside a polygon. The building rooftops were then digitized in QGIS using Google Satellite Hybrid2, a Tile Map Service (TMS) layer.
For every sample, the rooftop area \((m^2)\), the mean global horizontal irradiation \((\frac{kWh}{m^2})\), the usable solar radiation \((kWh)\) and the renewable electricity production \((kWh)\) were calculated.
Rooftops with an area of 30 \(m^2\) or less are assigned a value of 0. Some sample points fell on non-built-up areas (roads, parks, etc.); these were also assigned 0 so that the calculation includes a density factor.
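The per-rooftop calculation described above can be sketched as follows. This is a minimal illustration in Python (not the R code that produced the output below). It assumes that usable solar radiation is the rooftop area multiplied by the mean global horizontal irradiation, which appears consistent with the printed rows, and it uses a conversion factor of 0.129 that is only inferred from the ratio of `elec_prod` to `usable_sr` in those rows. The function and constant names are hypothetical.

```python
# Conversion factor from usable solar radiation to electricity production.
# ASSUMPTION: 0.129 is inferred from the printed rows (e.g. 20285 / 157246);
# the text does not state it.
CONVERSION_FACTOR = 0.129
MIN_ROOF_AREA_M2 = 30.0  # rooftops at or below this area count as 0


def usable_solar_radiation(area_m2: float, ghi_kwh_per_m2: float) -> float:
    """Usable solar radiation (kWh) = rooftop area x mean GHI."""
    return area_m2 * ghi_kwh_per_m2


def electricity_production(area_m2: float, ghi_kwh_per_m2: float) -> float:
    """Electricity production (kWh); 0 for rooftops of 30 m^2 or less."""
    if area_m2 <= MIN_ROOF_AREA_M2:
        return 0.0
    return CONVERSION_FACTOR * usable_solar_radiation(area_m2, ghi_kwh_per_m2)


if __name__ == "__main__":
    # (area, GHI) pairs rounded from the first printed rows of sample 1
    for area, ghi in [(91.0, 1728.0), (0.189, 1729.0)]:
        kwh = electricity_production(area, ghi)
        # kWh / 1000 gives MWh, matching the elec_prod_mwh column
        print(f"area={area} m2 -> {kwh:.0f} kWh = {kwh / 1000:.3f} MWh")
```

Because the inputs here are rounded, the results match the printed `usable_sr` and `elec_prod` values only approximately.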
## Simple feature collection with 6 features and 7 fields
## Geometry type: POLYGON
## Dimension: XY
## Bounding box: xmin: -6480273 ymin: -4138361 xmax: -6455331 ymax: -4118445
## CRS: +proj=merc +lon_0=0 +k=1 +x_0=0 +y_0=0 +ellps=WGS84 +datum=WGS84 +units=m +no_defs
## # A tibble: 6 x 8
## id area X_mean usable_sr elec_prod geometry
## <dbl> <dbl> <dbl> <dbl> <dbl> <POLYGON [m]>
## 1 3 91.0 1728. 157246. 20285. ((-6461504 -4132034, -6461500 -41320~
## 2 4 131. 1735. 227935. 29404. ((-6466451 -4118457, -6466438 -41184~
## 3 8 59.1 1736. 102639. 13240. ((-6465099 -4118684, -6465094 -41186~
## 4 1 0.189 1729. 327. 0 ((-6455332 -4132428, -6455331 -41324~
## 5 2 0.063 1725. 109. 0 ((-6480273 -4138360, -6480272 -41383~
## 6 5 0.077 1726. 133. 0 ((-6468889 -4135469, -6468889 -41354~
## # ... with 2 more variables: elec_prod_mwh <dbl>, elec_prod_gwh <dbl>
## id area X_mean usable_sr
## Min. : 1.00 Min. : 0.002 Min. :1721 Min. : 3
## 1st Qu.: 25.75 1st Qu.: 0.021 1st Qu.:1724 1st Qu.: 36
## Median : 50.50 Median : 0.078 Median :1727 Median : 134
## Mean : 50.50 Mean : 243.218 Mean :1727 Mean : 419697
## 3rd Qu.: 75.25 3rd Qu.: 0.240 3rd Qu.:1729 3rd Qu.: 413
## Max. :100.00 Max. :10405.603 Max. :1740 Max. :17939051
## elec_prod geometry elec_prod_mwh elec_prod_gwh
## Min. : 0 POLYGON :100 Min. : 0.00 Min. :0.00000
## 1st Qu.: 0 epsg:NA : 0 1st Qu.: 0.00 1st Qu.:0.00000
## Median : 0 +proj=merc...: 0 Median : 0.00 Median :0.00000
## Mean : 54056 Mean : 54.06 Mean :0.05406
## 3rd Qu.: 0 3rd Qu.: 0.00 3rd Qu.:0.00000
## Max. :2314138 Max. :2314.14 Max. :2.31414
## Simple feature collection with 6 features and 7 fields
## Geometry type: POLYGON
## Dimension: XY
## Bounding box: xmin: -6459285 ymin: -4157358 xmax: -6437314 ymax: -4133646
## CRS: +proj=merc +lon_0=0 +k=1 +x_0=0 +y_0=0 +ellps=WGS84 +datum=WGS84 +units=m +no_defs
## # A tibble: 6 x 8
## id area X_mean usable_sr elec_prod geometry
## <dbl> <dbl> <dbl> <dbl> <dbl> <POLYGON [m]>
## 1 1 90.4 1732. 156456. 20183. ((-6444962 -4133747, -6444957 -413374~
## 2 2 0.036 1722. 62.0 0 ((-6459285 -4157357, -6459285 -415735~
## 3 3 0.144 1730. 249. 0 ((-6444164 -4138267, -6444164 -413826~
## 4 4 0.065 1731. 112. 0 ((-6447967 -4135898, -6447966 -413589~
## 5 5 0.23 1732. 398. 0 ((-6445041 -4133647, -6445040 -413364~
## 6 6 0.212 1732. 367. 0 ((-6437315 -4138498, -6437314 -413849~
## # ... with 2 more variables: elec_prod_mwh <dbl>, elec_prod_gwh <dbl>
## id area X_mean usable_sr
## Min. : 1.00 Min. : 0.007 Min. :1721 Min. : 12
## 1st Qu.: 75.75 1st Qu.: 0.057 1st Qu.:1725 1st Qu.: 98
## Median :150.50 Median : 0.117 Median :1727 Median : 204
## Mean :150.50 Mean : 365.856 Mean :1728 Mean : 631475
## 3rd Qu.:225.25 3rd Qu.: 0.405 3rd Qu.:1730 3rd Qu.: 699
## Max. :300.00 Max. :13225.659 Max. :1740 Max. :22824922
## elec_prod geometry elec_prod_mwh elec_prod_gwh
## Min. : 0 POLYGON :300 Min. : 0.0 Min. :0.0000
## 1st Qu.: 0 epsg:NA : 0 1st Qu.: 0.0 1st Qu.:0.0000
## Median : 0 +proj=merc...: 0 Median : 0.0 Median :0.0000
## Mean : 81401 Mean : 81.4 Mean :0.0814
## 3rd Qu.: 0 3rd Qu.: 0.0 3rd Qu.:0.0000
## Max. :2944415 Max. :2944.4 Max. :2.9444
The sample mean is denoted by \(\overline{y}\) and the sample variance by \(s^2\). Energy values below are in megawatt-hours (MWh), printed as "mWh" in the output.
## [1] "The mean of sample 1: 54.06 (mWh)"
## [1] "The variance of sample 1: 78930.14 (mWh)"
An unbiased estimate of the variance of the estimator \(\overline{y}\), where \(N\) is the population size and \(n\) the sample size, is:
\[\begin{equation} \widehat{\mathrm{var}}(\overline{y}) = \left(\frac{N-n}{N}\right)\left(\frac{s^2}{n}\right) \tag{1.1} \end{equation}\]
The estimated standard error of the estimator \(\overline{y}\) is its square root:
\[\begin{equation} SEM = \sqrt{\widehat{\mathrm{var}}(\overline{y})} \tag{1.2} \end{equation}\]
## [1] "The variance of the sample mean: 757.5 (mWh)"
## [1] "The estimated standard error of the sample mean: 27.52 (mWh)"
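As a rough check of equations (1.1) and (1.2), the sketch below recomputes the two values printed above for sample 1. It is written in Python rather than the R used for the output, and the population size `N_POPULATION = 2482` is an assumption: the text does not state it, but it is the value implied by dividing the reported totals by the reported means.

```python
import math


def variance_of_mean(s2: float, n: int, N: int) -> float:
    """Equation (1.1): finite population correction times s^2 / n."""
    return ((N - n) / N) * (s2 / n)


def standard_error(variance: float) -> float:
    """Equation (1.2): square root of the estimated variance."""
    return math.sqrt(variance)


# ASSUMPTION: N = 2482 is inferred from the reported totals divided by the
# reported means; the text does not state the population size.
N_POPULATION = 2482

# Sample 1: reported sample variance 78930.14 with n = 100
var_mean = variance_of_mean(s2=78930.14, n=100, N=N_POPULATION)
sem = standard_error(var_mean)
print(f"var(mean) = {var_mean:.2f}, SEM = {sem:.2f}")
```

Under that assumption, the result agrees with the reported 757.5 and 27.52 to the printed precision.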
An unbiased estimator of the population total, \(\hat{t}\), is:
\[\begin{equation} \hat{t} = N \cdot \overline{y} \tag{1.3} \end{equation}\]
An unbiased estimate of the variance of the estimator \(\hat{t}\) is:
\[\begin{equation} \widehat{\mathrm{var}}(\hat{t}) = N^2 \cdot \widehat{\mathrm{var}}(\overline{y}) \tag{1.4} \end{equation}\]
The estimated standard error of the estimator \(\hat{t}\), abbreviated SET, is its square root:
\[\begin{equation} SET = \sqrt{\widehat{\mathrm{var}}(\hat{t})} \tag{1.5} \end{equation}\]
## [1] "The estimation of the renewable electricity production potential by all the buildings in the city: 134167.17 (mWh)"
## [1] "The variance of the estimated total: 4666447948.84 (mWh)"
## [1] "The estimated standard error of the total: 68311.4 (mWh)"
## [1] "The 95% confidence interval estimation for sample 1 is: (20743.51 (mWh), 247590.82 (mwh))"
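The chain from the sample mean to the interval printed above can be sketched in Python as follows. Two values are assumptions, not statements from the text: the population size 2482 (implied by the reported total divided by the reported mean) and the critical value 1.6604. The latter is the multiplier that reproduces the printed bounds; it is close to the 0.95 quantile of a t distribution with 99 degrees of freedom, whereas a two-sided 95% interval would conventionally use the 0.975 quantile (about 1.98 for 99 degrees of freedom).

```python
import math


def estimate_total(mean: float, var_mean: float, N: int):
    """Equations (1.3)-(1.5): total, its variance, and its standard error."""
    total = N * mean
    var_total = N ** 2 * var_mean
    return total, var_total, math.sqrt(var_total)


def confidence_interval(estimate: float, se: float, critical_value: float):
    """Symmetric interval: estimate +/- critical_value * standard error."""
    half_width = critical_value * se
    return estimate - half_width, estimate + half_width


N_POPULATION = 2482  # ASSUMPTION: inferred from reported total / reported mean
T_CRITICAL = 1.6604  # ASSUMPTION: approx. 0.95 quantile of t with 99 df

# Sample 1: reported mean 54.06 MWh and variance of the mean 757.50
total, var_total, se_total = estimate_total(54.06, 757.50, N_POPULATION)
lower, upper = confidence_interval(total, se_total, T_CRITICAL)
print(f"total = {total:.2f}, SE = {se_total:.2f}")
print(f"interval = ({lower:.2f}, {upper:.2f})")
```

Because the inputs are the rounded values from the output, the results differ from the printed figures by a few MWh.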
The sample mean is denoted by \(\overline{y}\) and the sample variance by \(s^2\).
## [1] "The mean of sample 2: 81.4 (mWh)"
## [1] "The variance of sample 2: 109098.9 (mWh)"
Equations (1.1) and (1.2) give the estimated variance and the estimated standard error of the estimator \(\overline{y}\).
## [1] "The variance of the sample mean: 319.71 (mWh)"
## [1] "The estimated standard error of the sample mean: 17.88 (mWh)"
Equations (1.3), (1.4) and (1.5) give the estimate of the population total \(\hat{t}\), its estimated variance and its estimated standard error.
## [1] "The estimation of the renewable electricity production potential by all the buildings in the city: 202036.58 (mWh)"
## [1] "The variance of the estimated total: 1969498517.02 (mWh)"
## [1] "The estimated standard error of the total: 44379.03 (mWh)"
## [1] "The 95% confidence interval estimation for sample 2 is: (128812.7 (mWh), 275260.47 (mwh))"
The following calculations pool sample 1 and sample 2 into a combined sample of 400 building rooftops, referred to below as sample 3.
The sample mean is denoted by \(\overline{y}\) and the sample variance by \(s^2\).
## [1] "The mean of sample 3: 74.56 (mWh)"
## [1] "The variance of sample 3: 101480.54 (mWh)"
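As a quick consistency check, the mean of the pooled sample should equal the size-weighted average of the two sample means. Using the rounded means reported for sample 1 (\(n_1 = 100\)) and sample 2 (\(n_2 = 300\)):

\[\overline{y}_3 = \frac{n_1\,\overline{y}_1 + n_2\,\overline{y}_2}{n_1 + n_2} = \frac{100 \cdot 54.06 + 300 \cdot 81.40}{400} = 74.565\]

This matches the reported 74.56 up to rounding of the inputs. The subscripts 1, 2 and 3 are introduced here only to tell the samples apart.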
Equations (1.1) and (1.2) give the estimated variance and the estimated standard error of the estimator \(\overline{y}\).
## [1] "The variance of the sample mean: 212.81 (mWh)"
## [1] "The estimated standard error of the sample mean: 14.59 (mWh)"
Equations (1.3), (1.4) and (1.5) give the estimate of the population total \(\hat{t}\), its estimated variance and its estimated standard error.
## [1] "The estimation of the renewable electricity production potential by all the buildings in the city: 185069.23 (mWh)"
## [1] "The variance of the estimated total: 1311007843.95 (mWh)"
## [1] "The estimated standard error of the total: 36207.84 (mWh)"
## [1] "The 95% confidence interval estimation for sample 3 is: (125374.03 (mWh), 244764.43 (mwh))"
La Plata is divided into two strata based on satellite imagery provided by the Copernicus Land Monitoring Service global maps of land cover, cover changes and related surface area statistics3.
Stratum 1 represents the built-up area of the city and stratum 2 represents the non-built-up area.
For each stratum, 60 random points were generated, and 20 building rooftops were then digitized using the same methods as in chapter 1.
## Simple feature collection with 6 features and 8 fields
## Geometry type: POLYGON
## Dimension: XY
## Bounding box: xmin: -6476517 ymin: -4140116 xmax: -6455221 ymax: -4125471
## CRS: +proj=merc +lon_0=0 +k=1 +x_0=0 +y_0=0 +ellps=WGS84 +datum=WGS84 +units=m +no_defs
## # A tibble: 6 x 9
## id area X_mean usable_sr elec_prod geometry landuse
## <dbl> <dbl> <dbl> <dbl> <dbl> <POLYGON [m]> <chr>
## 1 1 0.004 1725. 6.90 0 ((-6467786 -4139229, -64677~ Built ~
## 2 2 834. 1727. 1440503. 185825. ((-6476517 -4129667, -64764~ Built ~
## 3 3 0.143 1727. 247. 0 ((-6459586 -4135615, -64595~ Built ~
## 4 4 69.1 1728. 119366. 15398. ((-6455235 -4134598, -64552~ Built ~
## 5 5 0.057 1726. 98.4 0 ((-6464188 -4140116, -64641~ Built ~
## 6 6 0.06 1728. 104. 0 ((-6471555 -4125471, -64715~ Built ~
## # ... with 2 more variables: elec_prod_mwh <dbl>, elec_prod_gwh <dbl>
## id area X_mean usable_sr
## Min. : 1.00 Min. : 0.001 Min. :1721 Min. : 2
## 1st Qu.:15.75 1st Qu.: 0.016 1st Qu.:1727 1st Qu.: 27
## Median :30.50 Median : 0.039 Median :1729 Median : 68
## Mean :30.50 Mean : 143.529 Mean :1730 Mean : 247773
## 3rd Qu.:45.25 3rd Qu.: 0.078 3rd Qu.:1734 3rd Qu.: 134
## Max. :60.00 Max. :11988.234 Max. :1740 Max. :20680615
## elec_prod geometry landuse elec_prod_mwh
## Min. : 0 POLYGON :120 Length:120 Min. : 0.00
## 1st Qu.: 0 epsg:NA : 0 Class :character 1st Qu.: 0.00
## Median : 0 +proj=merc...: 0 Mode :character Median : 0.00
## Mean : 31870 Mean : 31.87
## 3rd Qu.: 0 3rd Qu.: 0.00
## Max. :2667799 Max. :2667.80
## elec_prod_gwh
## Min. :0.00000
## 1st Qu.:0.00000
## Median :0.00000
## Mean :0.03187
## 3rd Qu.:0.00000
## Max. :2.66780
After stratification, the strata are combined into one sample and the computations are the same as for random sampling.
## landuse elec_prod_mwh
## 1 Built up 63.73959
## 2 Non built up 0.00000
## [1] "The mean of the stratfied sample is: 31.87 (mWh)"
## [1] "The variance of the stratfied sample is: 62862.69 (mWh)"
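The per-stratum means and the combined mean printed above can be related with a short Python sketch (the original output comes from R; the helper names here are hypothetical). It pools the strata by sample size, which mirrors the approach described in the text of treating the combined strata as one random sample.

```python
from collections import defaultdict


def stratum_means(records):
    """Mean value per stratum from (stratum, value) pairs."""
    sums, counts = defaultdict(float), defaultdict(int)
    for stratum, value in records:
        sums[stratum] += value
        counts[stratum] += 1
    return {s: sums[s] / counts[s] for s in sums}


def pooled_mean(means, sizes):
    """Mean of the combined sample: stratum means weighted by sample size."""
    n = sum(sizes.values())
    return sum(means[s] * sizes[s] for s in means) / n


# Reported stratum means (MWh) with 60 points per stratum, as in the text.
means = {"Built up": 63.73959, "Non built up": 0.0}
sizes = {"Built up": 60, "Non built up": 60}
print(round(pooled_mean(means, sizes), 2))
```

Note that this weights each stratum by its share of the sample. A classical stratified estimator would instead weight by each stratum's share of the population, which the text does not report.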
Equations (1.1) and (1.2) give the estimated variance and the estimated standard error of the estimator \(\overline{y}\).
## [1] "The variance of the sample mean: 498.53 (mWh)"
## [1] "The estimated standard error of the sample mean: 22.33 (mWh)"
Equations (1.3), (1.4) and (1.5) give the estimate of the population total \(\hat{t}\), its estimated variance and its estimated standard error.
## [1] "The estimation of the renewable electricity production potential by all the buildings in the city: 79100.84 (mWh)"
## [1] "The variance of the estimated total: 3071095766.73 (mWh)"
## [1] "The estimated standard error of the total: 55417.47 (mWh)"
## [1] "The 95% confidence interval estimation for the stratified sample is: (-12767.99 (mWh), 170969.66 (mwh))"
When La Plata is divided into built-up and non-built-up areas, the sample values from the built-up stratum are far more variable than those from the non-built-up stratum.
Therefore, two new strata are computed for an allocated stratified sample of 100:
## Simple feature collection with 6 features and 8 fields
## Geometry type: POLYGON
## Dimension: XY
## Bounding box: xmin: -6463597 ymin: -4130860 xmax: -6448816 ymax: -4123499
## CRS: +proj=merc +lon_0=0 +k=1 +x_0=0 +y_0=0 +ellps=WGS84 +datum=WGS84 +units=m +no_defs
## # A tibble: 6 x 9
## id area X_mean usable_sr elec_prod geometry landuse
## <dbl> <dbl> <dbl> <dbl> <dbl> <POLYGON [m]> <chr>
## 1 1 175. 1732. 303058. 39094. ((-6452163 -4130836, -645215~ Built ~
## 2 2 91.7 1735. 159086. 20522. ((-6448832 -4127833, -644882~ Built ~
## 3 3 0.066 1739. 115. 0 ((-6457547 -4125248, -645754~ Built ~
## 4 4 0.006 1739. 10.4 0 ((-6457997 -4123499, -645799~ Built ~
## 5 5 54.8 1731. 94900. 12242. ((-6463593 -4124949, -646358~ Built ~
## 6 6 56.2 1735. 97564. 12586. ((-6456051 -4129352, -645604~ Built ~
## # ... with 2 more variables: elec_prod_mwh <dbl>, elec_prod_gwh <dbl>
## id area X_mean usable_sr
## Min. : 1.00 Min. : 0.001 Min. : 0 Min. : 0
## 1st Qu.:20.75 1st Qu.: 0.006 1st Qu.:1729 1st Qu.: 10
## Median :45.50 Median : 0.021 Median :1733 Median : 36
## Mean :45.75 Mean : 131.517 Mean :1716 Mean : 227277
## 3rd Qu.:70.25 3rd Qu.: 53.038 3rd Qu.:1736 3rd Qu.: 91834
## Max. :95.00 Max. :6109.904 Max. :1741 Max. :10546739
## elec_prod geometry landuse elec_prod_mwh
## Min. : 0 POLYGON :100 Length:100 Min. : 0.00
## 1st Qu.: 0 epsg:NA : 0 Class :character 1st Qu.: 0.00
## Median : 0 +proj=merc...: 0 Mode :character Median : 0.00
## Mean : 29316 Mean : 29.32
## 3rd Qu.: 11847 3rd Qu.: 11.85
## Max. :1360529 Max. :1360.53
## elec_prod_gwh
## Min. :0.00000
## 1st Qu.:0.00000
## Median :0.00000
## Mean :0.02932
## 3rd Qu.:0.01185
## Max. :1.36053
After stratification, the strata are combined into one sample and the computations are the same as for random sampling.
## landuse elec_prod_mwh
## 1 Built up 30.85859
## 2 Non built up 0.00000
## [1] "The mean of the stratfied sample is: 29.32 (mWh)"
## [1] "The variance of the stratfied sample is: 25362.42 (mWh)"
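The combined mean can be checked against the stratum means in the same way as a size-weighted average. The stratum sizes used here, 95 built-up and 5 non-built-up observations, are an inference from the reported figures rather than something the text states:

\[\overline{y} = \frac{n_{b}\,\overline{y}_{b} + n_{nb}\,\overline{y}_{nb}}{n_{b} + n_{nb}} = \frac{95 \cdot 30.85859 + 5 \cdot 0}{100} \approx 29.32\]

Here the subscripts \(b\) and \(nb\) are introduced for the built-up and non-built-up strata.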
Equations (1.1) and (1.2) give the estimated variance and the estimated standard error of the estimator \(\overline{y}\).
## [1] "The variance of the sample mean: 243.41 (mWh)"
## [1] "The estimated standard error of the sample mean: 15.6 (mWh)"
Equations (1.3), (1.4) and (1.5) give the estimate of the population total \(\hat{t}\), its estimated variance and its estimated standard error.
## [1] "The estimation of the renewable electricity production potential by all the buildings in the city: 72761.47 (mWh)"
## [1] "The variance of the estimated total: 1499457748.65 (mWh)"
## [1] "The estimated standard error of the total: 38722.83 (mWh)"
## [1] "The 95% confidence interval estimation for the stratified sample is: (8466.42 (mWh), 137056.52 (mwh))"
| Sample (size) | Mean (MWh) | Var of Mean (MWh²) | SEM (MWh) | Total Estimation (MWh) | Var of Total (MWh²) | SET (MWh) | Lower CI (MWh) | Upper CI (MWh) |
|---|---|---|---|---|---|---|---|---|
| Sample 1 - 100 | 54.06 | 757.50 | 27.52 | 134,167.17 | 4,666,447,949 | 68,311.40 | 20,743.51 | 247,590.82 |
| Sample 2 - 300 | 81.40 | 319.71 | 17.88 | 202,036.58 | 1,969,498,517 | 44,379.03 | 128,812.70 | 275,260.47 |
| Both Samples - 400 | 74.56 | 212.81 | 14.59 | 185,069.23 | 1,311,007,844 | 36,207.84 | 125,374.03 | 244,764.43 |
| Stratified Sample - 120 | 31.87 | 498.53 | 22.33 | 79,100.84 | 3,071,095,767 | 55,417.47 | -12,767.99 | 170,969.66 |
| Allocated Stratified Sample - 100 | 29.32 | 243.41 | 15.60 | 72,761.47 | 1,499,457,749 | 38,722.83 | 8,466.42 | 137,056.52 |